Compression and decompression coding scheme and apparatus

ABSTRACT

In a compression and decompression coding method, arrangement and computer program product, a data signal containing a number of symbols is converted into a series of codewords. A set of codewords is established and the data signal is monitored to determine the most frequently occurring symbols therein and/or sequences of symbols therein containing at least two symbols. A codeword is then allocated to each of the most frequently occurring of the symbols and/or symbol sequences. At least one codeword is reserved for indicating uncompressed data. When compressing a signal, the incoming symbols are first checked to determine if they correspond to a codeword. If a symbol corresponds to more than one codeword, further symbols are read until a symbol occurs which corresponds to one codeword only. That codeword is then transmitted. Any symbol that does not correspond to a codeword is supplemented with a codeword indicative of no compression and is then transmitted.

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates generally to a coding scheme and apparatus for the compression and decompression of data. It is particularly directed to the compression of signals that exhibit so-called memory, where a portion of a signal depends on the value of a preceding portion. The invention has particular application to medical systems, such as implantable pacemaker devices, which have limited memory but require the storage of large quantities of data.

2. Description of the Prior Art

Medical systems for monitoring physiological functions are becoming more complex as the need for diagnostic applications increases. In particular there is a need for intracardial detection systems and pacemaker systems capable of storing ever increasing numbers of signals, such as electrocardiogram signals (ECG and IEGM), pressure signals and bioimpedance signals or the like, of ever increasing length. However, the available memory space for data storage is often restricted, particularly in implanted pacemaker systems. Perhaps more importantly in implanted systems, the amount of data that may be collected is also restricted by the transmission capacity of a telemetry link between an implanted device and its programmer or other external control device. For example, a defibrillator today typically requires a transmission time of up to 40 minutes for downloading to its controller all the data that can be collected. If the required quantities of data are to be made available for processing, this data must be compressed.

Data compression can generally be divided into two forms. A first form is based on viewing a signal as a mathematical function and observing and utilizing characteristics of this function to compress data. The second form utilizes coding theory and is based on the statistics of multiple discrete signal levels, or symbols, occurring in a signal.

A conventional algorithm working according to this latter principle is the Tunstall code. This code maps variable-length symbol segments of an input signal into fixed-length codewords. Since the codeword length is fixed, the number of codewords n is known in advance. The object is to assign codewords to symbol segments that occur with approximately equal probability. The procedure begins with a set of symbol segments consisting of each of the individual symbols occurring in the input signal, such as m symbols in total. The most probable symbol is then removed from the set and replaced by m new segments, each of which is the removed symbol suffixed by one of the m input symbols. This procedure continues until the number of symbol segments, M, in the set is equal to the number of codewords, n. The codewords are then assigned to the symbol segments.

An example of Tunstall encoding is illustrated in FIGS. 1 and 2. FIG. 1 shows a probability tree comprising nodes, and branches emanating from each node. In the illustrated example it is assumed that 8 codewords are available for compressing data, 1 to 8. The tree in FIG. 1 assumes that a source signal comprises two symbols, ‘0’ and ‘1’. Each symbol is represented by a first branch. The probability of a symbol occurring is given at the node terminating the associated branch. Hence a ‘0’ occurs with a probability of 0.6 and a ‘1’ occurs with a probability of 0.4. The branch with the highest probability is expanded further by adding a second series of branches for each possible symbol. The ‘0’ branch is thus bifurcated into a second ‘0’ branch with a total probability of 0.36 and a ‘1’ branch with probability 0.24. The first branch for symbol ‘1’ now has the highest probability and is expanded in turn, resulting in two further branches with probability of 0.24 and 0.16. This process continues until the number of end branches equals the number of codewords, in the present example 8, with the probability of each branch occurring being relatively close. Each end branch represents a sequence of symbols. These sequences are assigned a codeword as shown in FIG. 2.
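
The expansion procedure just described can be sketched in a few lines of Python. This is an illustrative sketch only: the function name tunstall_segments and the numbering of the resulting codewords are assumptions made here, and the order in which FIG. 2 assigns codewords to sequences may differ.

```python
import heapq

def tunstall_segments(probs, n_codewords):
    """Return the symbol segments of a Tunstall code.

    probs maps each source symbol (a one-character string) to its
    probability; n_codewords is the number of fixed-length codewords.
    """
    m = len(probs)
    # Max-heap emulated with negated probabilities; start with single symbols.
    heap = [(-p, s) for s, p in probs.items()]
    heapq.heapify(heap)
    # Each expansion removes one segment and adds m, a net gain of m - 1.
    while len(heap) + m - 1 <= n_codewords:
        neg_p, segment = heapq.heappop(heap)  # most probable segment
        for s, p in probs.items():
            heapq.heappush(heap, (neg_p * p, segment + s))
    return sorted(segment for _, segment in heap)

# Source of the example: '0' with probability 0.6, '1' with 0.4, 8 codewords.
segments = tunstall_segments({"0": 0.6, "1": 0.4}, 8)
# Hypothetical numbering 1 to 8 in lexicographic order of the segments.
codebook = {segment: code for code, segment in enumerate(segments, start=1)}
print(codebook)
```

With the probabilities of the example, the eight end branches obtained are 0000, 0001, 001, 010, 011, 100, 101 and 11. Only end branches receive a codeword, which is the property of the Tunstall code discussed in the following paragraph.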

A problem associated with Tunstall encoding is that signals containing a large number of different symbols, for instance a large number of discrete signal levels, require a very large number of codewords. For example, in a pacemaker system, a typical electrogram or IEGM signal is represented by 8 bits sampled at 512 Hz. The number of symbols in this signal is thus 256. A Tunstall codebook for this signal would have to contain at least 256 codewords just to cover the different individual symbols. In order to obtain compression of the signal, the branches must be expanded further, which adds a further 256 codewords per branch. The Tunstall code is a general-purpose data compression algorithm and is not adapted to special classes of data. In particular it is not effective when applied to signals which exhibit memory. Typically, many of the signals monitored by medical systems, and cardiac pacers in particular, exhibit some memory, one example being the IEGM signal. The use of Tunstall encoding for processing sampled data in medical systems, and specifically implanted systems, is thus limited.

SUMMARY OF THE INVENTION

It is an object of the present invention to provide a coding method and apparatus that allows a signal exhibiting memory to be compressed and decompressed efficiently and without distortion.

The above object is achieved in accordance with the present invention in a data coding method for converting a signal containing a plurality of symbols into codewords, and an apparatus and software product for implementing the method, including the steps of: establishing a set of codewords, monitoring a data signal and determining the most frequently occurring symbols and/or sequences of symbols containing at least two symbols, and allocating one codeword to each of the most frequently occurring of said symbols and/or symbol sequences.

According to a further aspect, the invention provides a data compression method for compressing a data signal containing a plurality of symbols, including: converting the most frequently occurring symbols and/or symbol sequences into codewords, and supplementing the remaining symbols with at least one codeword indicative of no compression.

The invention further proposes an arrangement for compressing and decompressing a data signal containing a plurality of symbols, including: means for storing codewords corresponding to symbols and/or symbol sequences, and means for determining if a symbol in said data signal corresponds to at least one codeword in the storage means and, when a symbol corresponds to only one codeword, for transmitting said codeword, wherein the determining means are further adapted to transmit a symbol if it corresponds to no codeword in the storage means.

According to a fourth aspect, the invention proposes a computer program product for converting a signal containing a plurality of symbols into a compressed signal, including computer readable program code means for establishing a set of codewords, determining the most frequently occurring symbols and/or sequences of symbols containing at least two symbols in a data signal and allocating one codeword to each of the most frequently occurring of said symbols and/or symbol sequences.

By providing codewords to only the most frequently occurring symbols and symbol sequences, the number of codewords required is greatly reduced. At the same time, however, the efficiency of the compression is increased, as these codewords are allocated to the varying length symbol sequences that appear with the highest frequency in the signal. This compression technique furthermore fully exploits any memory in a signal. A further difference over prior art coding schemes and the Tunstall code in particular is that codewords may be allocated to every node and end branch in a coding tree and not just the end branches. This is also particularly useful when compressing signals that exhibit memory, such as signals monitoring physiological quantities such as the heartbeat, respiration rate and the like.

DESCRIPTION OF THE DRAWINGS

FIG. 1, as noted above, depicts a coding tree illustrating the Tunstall encoding algorithm.

FIG. 2, as noted above, depicts a coding table corresponding to the coding tree of FIG. 1.

FIG. 3 is a schematic block diagram of an encoder/decoder (compressor/decompressor) according to the present invention.

FIGS. 4 a through 4 e are a series of histograms illustrating the generation of a codebook according to the present invention.

FIG. 5 depicts a coding tree corresponding to the histograms of FIGS. 4 a through 4 e.

FIG. 6 depicts a codebook corresponding to the coding tree of FIG. 5.

FIG. 7 is a schematic illustration of a pre-processing function in accordance with the present invention.

FIG. 8 is a schematic diagram of an arrangement for compressing and decompressing data according to the present invention.

FIG. 9 is a schematic illustration of a codebook memory illustrating the mapping between symbols and memory locations in accordance with the invention.

FIG. 10 is a flowchart illustrating a method for compressing data utilizing the inventive arrangement of FIG. 8.

FIGS. 11 a through 11 e are a sequence of graphs illustrating the compression and subsequent decompression of an IEGM signal, in accordance with the present invention.

DESCRIPTION OF THE PREFERRED EMBODIMENTS

The compression and decompression scheme according to the present invention is based on the representation of a data signal containing a plurality of symbols in coded form utilizing codewords of fixed length. The symbols contained in an input signal may be digital representations of characters, such as in the ASCII format. Typically, however, the symbols will be binary representations of discrete signal levels in a sampled analogue signal, such as an ECG, IEGM or other signals for monitoring physiological activity.

FIG. 3 schematically illustrates the function of an encoder/decoder for compressing and decompressing a data signal. The data signal contains a plurality of symbols which are read by a symbol reader 1 and relayed to an encoder/decoder 2. The encoder/decoder 2 has access to a storage medium 3 containing a codebook of the fixed-length codewords. The input symbols or symbol sequences are mapped to codewords in the codebook 3 and replaced by the corresponding codeword. The reduction of symbols and symbol sequences of variable length into codewords results in the compression of the data. Decompression is accomplished by inverting the operation. Codewords are passed through the encoder/decoder 2, which with the aid of the mapping to the codebook 3 reconstitutes the original input sequence.

The generation of a codebook according to the present invention commences with determining the number of codewords to be used. This is selected as a function of the total number of symbols contained in an input signal and the degree of compression required.

The codebook is generated by observing the input signal during a test phase and determining the probabilities of the symbols and sequences of symbols occurring. This is accomplished by observing the input signal during a set time period, for example 20 s, and noting the symbol that occurs with the highest frequency during this period. The signal is then observed again for the same time period and the symbol that follows the noted symbol most frequently is noted. This process continues until the number of most frequently occurring symbols and/or symbol sequences noted is equal to the number of codewords in the codebook. It will be understood that only those symbols or symbol sequences that occur most frequently will be coded. This allows the number of codewords to be kept to a reasonable number.

A simplified example of this process is illustrated in FIGS. 4 a through 4 e, 5 and 6 with reference to the generation of a codebook containing 8 codewords. The input signal consists of 5 symbols 1, 2, 3, 4, 5.

Turning now to FIG. 4 a, a first histogram is produced after observing the input signal for a set period. It is evident that the most frequently occurring symbol is 3 with an occurrence of 50. FIG. 4 b shows a histogram produced after the same period of time and gives the frequency of the different symbols that follow 3. Thus the sequence 32 has occurred ten times, the sequence 33 fifteen times, and the sequence 34 twelve times. FIG. 4 c shows a histogram of the symbols following 2 and FIG. 4 d shows a histogram of the symbols following the symbol sequence 3, 3. Finally FIG. 4 e shows the histogram of the symbols following the symbol 4. The codebook is established by determining which 8 symbols and/or symbol sequences occurred most frequently. This is illustrated schematically in a code tree in FIG. 5. In this code tree, a codeword has been assigned to every node and end branch. Hence the symbol 3 has been allocated the codeword ‘2’ but the symbol sequence 33 has also been allocated a codeword, ‘5’, and the symbol sequence 332 has been allocated the codeword ‘8’. This is summarized in tabular form in FIG. 6.
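
The selection of the most frequent symbols and symbol sequences may be sketched as follows. This is a minimal sketch under stated assumptions: the function name build_codebook, the max_len limit on sequence length and the tie-breaking rule are hypothetical, and the statistics are gathered in a single pass over a stored signal rather than by repeated observation periods as described above. A sequence only becomes a candidate once its prefix has been accepted, so that every node of the resulting tree carries a codeword.

```python
from collections import Counter

def build_codebook(signal, num_codewords, max_len=4):
    """Pick the most frequent symbols and sequences, growing a code tree."""
    # Count every sequence of 1 to max_len consecutive symbols.
    counts = Counter(
        tuple(signal[i:i + n])
        for n in range(1, max_len + 1)
        for i in range(len(signal) - n + 1)
    )
    # Candidates start as the single symbols (first branches of the tree).
    candidates = {seq for seq in counts if len(seq) == 1}
    codebook = []
    while candidates and len(codebook) < num_codewords:
        # Most frequent candidate; ties resolved by symbol order (assumption).
        best = min(candidates, key=lambda seq: (-counts[seq], seq))
        candidates.remove(best)
        codebook.append(best)
        # The observed one-symbol extensions of 'best' become candidates.
        candidates.update(
            seq for seq in counts
            if len(seq) == len(best) + 1 and seq[:-1] == best
        )
    return codebook

# Toy signal (not the data of FIGS. 4 a through 4 e).
book = build_codebook([3, 3, 2, 3, 3, 2, 3, 4], num_codewords=4, max_len=3)
print(book)
```

The position of a sequence in the returned list could serve as its codeword, although the numbering used in FIG. 6 may follow a different order.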

It is apparent from the above example that codewords are not assigned to every symbol of an input signal. Thus when these unassigned symbols are read by an encoder, the symbol is transmitted uncompressed. To distinguish the uncompressed data from the compressed data, it is preferable to utilize some form of distinguishing symbol or codeword. Thus at least one codeword will be reserved for transmitting uncoded input symbols.

Depending on the signal to be compressed, it may be desirable to pre-process the signal in order to increase the frequency of a very few symbols and sequences. This considerably increases the efficiency of the compression. For example, pre-processing can be very effective when compressing sampled IEGM, ECG or other physiological signals that monitor some form of periodic activity.

FIG. 7 schematically illustrates the preferred function of such a pre-processor. In FIG. 7 an input data stream denoted by 10 comprises the symbols IEGM_{n-1}, IEGM_n, IEGM_{n+1} and IEGM_{n+2}. The function generates the difference symbol value between a symbol and a preceding symbol. The output data stream 12 thus comprises the symbols (IEGM_n − IEGM_{n-1}) and (IEGM_{n-1} − IEGM_{n-2}), etc.

In a preferred embodiment of the present invention, the compression coding scheme according to the invention is used to compress an IEGM signal comprising 255 symbols ranging from 0 to 254. After pre-processing in accordance with the function of FIG. 7, the signal contains 509 possible symbols, ranging from −254 to +254. However, while the number of possible symbols has almost doubled, the form of the original IEGM signal is such that the processed signal contains mainly symbols close to 0 such as 1, −1, 2, −2 etc. Thus the concentration of a very few symbols and symbol sequences is increased by the difference function.
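
The difference function of FIG. 7 may be sketched as follows. The function name delta_encode and the initial parameter (the value assumed to precede the first sample) are assumptions introduced for this illustration, and the sample values are invented.

```python
from collections import Counter

def delta_encode(samples, initial=0):
    """Replace each sample by its difference from the preceding sample."""
    previous = initial  # assumed predecessor of the first sample
    diffs = []
    for sample in samples:
        diffs.append(sample - previous)
        previous = sample
    return diffs

# Slowly varying toy signal standing in for a sampled IEGM.
samples = [100, 101, 103, 102, 102, 103, 104, 104]
diffs = delta_encode(samples)
print(diffs)
# After the first value, only a handful of small symbols remain.
print(Counter(diffs[1:]).most_common())
```

In this toy signal the eight distinct sample levels collapse to the four difference symbols 1, 0, 2 and −1, which illustrates how the concentration of a very few symbols is increased.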

A pre-processed IEGM signal of the type described above can typically be compressed efficiently utilizing a codebook containing 254 codewords, for example 254 8-bit words ranging from 0 to 253. The 254 most frequently occurring symbols and/or symbol sequences are then converted into codewords. The 8-bit words with values 254 and 255 are then reserved. These are utilized to signal whether data is compressed or uncompressed. Preferably, one reserved codeword, for example 255, is sent to indicate that uncompressed data follows. The other codeword 254 can then be utilized to signal that compressed data is following. To avoid having to generate different symbols for negative and positive symbols, an uncompressed negative symbol is preferably indicated by preceding it with both symbols sent contiguously, for example, 255 followed by 254 followed by the uncompressed equivalent positive symbol.
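
The use of the two reserved codewords for a single uncompressed symbol may be sketched as follows. The function name frame_uncompressed is hypothetical, and the sketch assumes that the symbol directly follows compressed data, so that the indicator 255 has to be sent.

```python
UNCOMPRESSED = 255  # reserved codeword: uncompressed data follows
COMPRESSED = 254    # reserved codeword: compressed data follows

def frame_uncompressed(value):
    """Frame one uncoded difference symbol for transmission."""
    if value >= 0:
        return [UNCOMPRESSED, value]
    # Negative symbol: both reserved codewords, then the positive equivalent.
    return [UNCOMPRESSED, COMPRESSED, -value]

print(frame_uncompressed(7))
print(frame_uncompressed(-7))
```

The sketch does not address how an uncompressed symbol whose magnitude equals a reserved value is distinguished from the indicators themselves; the text above leaves this point open.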

The codebook is preferably generated for a class of signals and retained for different signals of the same class so that it does not need to be re-established prior to compressing data. For example, a cardiac pacer containing data compression and decompression circuitry for processing some physiological measurement such as an IEGM signal, ECG signal or bioimpedance signal could be subjected to a training phase for each patient to establish a codebook that is specific to each patient. The training phase could also be performed using a test sequence that is representative of the class of signal, so that one codebook is used for several patients. A further option would be to utilize an adaptive procedure, whereby the statistics of the signal are observed and a codebook newly generated prior to each compression. This codebook would then be optimized for each signal generated in a pacemaker. However, it will be understood that the codebook must be retained long enough to enable the compressed data to be decompressed.

In the coding procedures described above, a codeword is allocated to every node and end branch of a probability tree. It will, however, be understood that codewords could be assigned only to the end branches of the tree. The efficiency of such a code would depend on the characteristics of the starting signal.

An arrangement for encoding and decoding a signal is illustrated in FIG. 8. The arrangement includes an input stage 20 for reading a symbol. A processor 21, that preferably takes the form of a single chip microprocessor with associated memory, is coupled to the input stage. A codebook memory 22 having 256 8-bit memory locations with addresses from 0 to 255 is connected to the processor 21. An output stage 23 for the codewords is coupled to the memory 22 and processor 21. This output stage 23 is finally coupled to a storage memory 24 for storing the compressed data and to a telemetry transmission unit 26, which forwards the data to a remote external programmer or controller. The input stage 20 is also coupled to the storage memory 24 and telemetry transmission unit 26 for transmitting uncompressed data. The encoding and decoding arrangement is preferably preceded by a preprocessing stage as shown in FIG. 7.

The function of this arrangement is basically as follows. A first symbol is read by the input stage 20. The processor 21 checks whether this symbol corresponds to more than one symbol sequence in the codebook memory 22 and if so, the next symbol is read. This process is repeated until the sequence of symbols read corresponds to only one codeword. This codeword is then emitted in place of the symbol sequence and is either stored in the memory 24 or sent directly to an external device via the telemetry transmission unit 26. If a symbol read corresponds to no coded sequence, it is transmitted unchanged to the storage memory 24 or transmission unit 26 but preceded by the reserved codeword indicating uncompressed data (255 in the example given above).

The addresses 0 to 255 of the codebook memory 22 are the codewords. The processor 21 furthermore performs a mapping between the incoming symbols that form at least part of a coded sequence and each memory cell 221. This is illustrated in more detail in FIG. 9. The first symbol in a sequence will cause the processor 21 to access the first memory cell, in ascending order of address, to which the symbol is mapped. These first memory cells 221 correspond to the first branches in a probability tree. Thus if the first symbol in a sequence is ‘1’, the processor 21 will access the fourth memory location (address 3), since this is the first location 221 to which a ‘1’ is mapped. Each memory cell 221 further contains information indicating whether the mapped symbol corresponds to more than one coded symbol sequence. This information is shown on the right hand side of each memory cell 221. The information is in the form of a code represented by the numbers 0 to 7, which indicate both the number of possible further branches and the following symbols corresponding to the further branches. A conversion table 25 shows the significance of each number. Hence it can be seen that ‘0’ indicates an end branch, while ‘2’ represents two possible further branches with the following symbols ‘1’ and ‘2’. Any symbol mapped to a memory cell 221 containing a ‘0’ will result in the processor 21 transmitting the address of the memory cell as a codeword. Any other number contained in the memory cell 221 indicates that at least one further branch is possible. The processor 21 then fetches the next symbol from the input stage 20 and determines whether this symbol forms part of the possible codewords. If it does, the processor 21 calculates the address of the next memory cell 221 and the process is repeated.

The next address is calculated by summing the number of possible branches contained in all addresses starting from the first address up to the present address, and then adding the position of the fetched symbol in the list of possible symbols given in the conversion table 25. The sum of possible branches is equal to the sum of the first mapped memory locations representing the initial branches of the probability tree and the possible branches stored in all previous memory locations. Thus an input sequence consisting of 1, 2 would result in a first mapping being made to memory location 3. This contains the code 5, which indicates that four further branches are possible. The next symbol is fetched. It is verified that ‘2’ corresponds to one of the possible branches. The symbol ‘2’ has the fourth position in the list of branches (−1, 0, 1, 2) given in the conversion table 25. Thus the sequence 1, 2 is a valid coded sequence. The next address is equal to the sum of the initial basic branches (i.e. the addresses to which a first symbol can be mapped), which in the present example is 5 (addresses 0 to 4). To this is added the sum of the branches contained in memory locations up to address 3, and the position of the subsequent symbol in the list indicated in memory location 3. This gives a total of 5+(3+1+2)+4=15. The next address is thus the 15th location or address 14, since the addresses start with 0. This is verified by the mapping of the symbol 2 indicated in the left column of the cell with address 14.
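
The address calculation of this example may be sketched as follows. The function name next_address and its parameters are hypothetical. Only the three conversion table entries mentioned in the text (codes 0, 2 and 5) are reproduced; the remaining entries of the table 25 are not given here and are therefore omitted. The branch counts 3, 1 and 2 for addresses 0 to 2 are read off the sum 5+(3+1+2)+4 above.

```python
# Conversion table entries named in the text: code -> possible next symbols.
CONVERSION_TABLE = {
    0: (),              # end branch, no further symbols
    2: (1, 2),          # two further branches
    5: (-1, 0, 1, 2),   # four further branches
}

def next_address(initial_branches, branch_counts, address, code, symbol):
    """Return the address of the next memory cell, or None if no branch.

    initial_branches: number of cells to which a first symbol can be mapped.
    branch_counts:    further branches stored in the cells 0 .. address - 1.
    address, code:    present cell and the code it contains.
    symbol:           the symbol just fetched.
    """
    followers = CONVERSION_TABLE[code]
    if symbol not in followers:
        return None  # the fetched symbol does not continue a coded sequence
    position = followers.index(symbol) + 1  # 1-based position in the list
    nth_location = initial_branches + sum(branch_counts[:address]) + position
    return nth_location - 1  # addresses start with 0

# Sequence 1, 2: first symbol mapped to address 3, which holds code 5.
print(next_address(5, [3, 1, 2], address=3, code=5, symbol=2))
```

The calculation corresponds to storing the tree level by level, so that the children of each cell occupy consecutive addresses after the initial branches.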

The conversion table 25 alternatively may be more complex in structure and provide absolute memory locations corresponding to each possible branch. In this way the next address would not need to be calculated, but more storage capacity would be needed.

The conversion table is preferably stored in the processor 21. However, it may be possible to store some of the information about the branches in the codebook memory 22, depending on how much capacity is available for each address.

The arrangement illustrated in FIG. 8 is a schematic representation of possible encoding and decoding hardware. It will, however, be understood that the functions of the various elements shown in FIG. 8 may be performed entirely in a digital processor system operating under the control of a program. The codebook memory 22 could then be implemented virtually from part of the memory space incorporated in the processor system.

FIG. 10 is a flow chart illustrating the procedure for compressing data using the arrangement shown in FIGS. 8 and 9. The procedure starts at step 30 with the reading of a symbol. In step 31 it is determined whether the symbol is inside the designated range, i.e. whether the symbol forms part of a coded sequence and can be mapped to a memory location. If this is the case, the process moves to step 32 and the new memory address is calculated. In the following step 33 a marker, ‘codeword_started’, indicating that the coding of a sequence has started, is set. In step 34 it is verified whether the data last sent was uncompressed. If this is true, in step 35 the symbol indicating compression is sent. If the last data sent was in compressed mode, the process moves directly from step 34 to step 36, where the contents of the memory location are read and it is determined whether an end branch has been reached. If the end branch has not been reached, the procedure returns to step 30 and the next symbol is fetched. If the end branch is reached, the procedure passes to step 38, where the memory address is sent as the codeword. In step 39, the memory address is reset to 0 and in step 40, the marker ‘codeword_started’ is reset to false, because the coding of a symbol or symbol sequence is terminated. The process then returns to step 30, and the next symbol is fetched.

If a symbol is discovered to be out of range in step 31, indicating that the symbol does not form part of a coded sequence, the procedure goes to step 41 where the status of the marker ‘codeword_started’ is verified. If this is true, this means that a codeword has been started, but the subsequent symbol does not form part of the coded sequence. In step 42, therefore, the memory address is sent as the codeword. The memory address is then reset to 0 in step 43, the marker ‘codeword_started’ is reset to false in step 44 and the procedure returns to step 31, where the fetched symbol is checked against the starting symbols of coded sequences to verify if it forms part of a coded sequence. If in step 41 it is determined that no codeword has been started, i.e. the status of the ‘codeword_started’ marker is false, this means that the fetched symbol is not contained in any coded sequence. In step 45, the transmission mode is checked. If the last data sent was compressed, the symbol indicating uncompressed data is sent in step 46 followed by the read symbol in step 47. If the last transmission was not in compression mode, the symbol is sent directly in step 47. The procedure then returns to the start at step 30.
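
The flow of FIG. 10 may be approximated in software as follows. This is a simplified sketch, not the memory-mapped arrangement of FIGS. 8 and 9: the codebook is assumed to be a dictionary mapping symbol sequences to codewords in which every prefix of a coded sequence is itself coded, a codeword is emitted once the sequence can no longer be extended rather than immediately at an end branch, the stream is assumed to start in compressed mode, and negative symbols are not treated. The function name compress is hypothetical; the three codebook entries are those named for FIG. 5.

```python
COMPRESSED, UNCOMPRESSED = 254, 255  # reserved codewords, values as in the example above

def compress(symbols, codebook):
    """Greedy encoder: longest coded sequence, else mode switch and raw symbol."""
    out = []
    mode = COMPRESSED  # assumption: the stream starts in compressed mode
    current = ()       # sequence matched so far; non-empty = 'codeword_started'

    def send_codeword():
        nonlocal mode, current
        if mode != COMPRESSED:        # steps 34 and 35
            out.append(COMPRESSED)
            mode = COMPRESSED
        out.append(codebook[current]) # steps 38 and 42
        current = ()                  # steps 39/40 and 43/44

    i = 0
    while i < len(symbols):
        candidate = current + (symbols[i],)
        if candidate in codebook:     # step 31: symbol extends a coded sequence
            current = candidate
            i += 1
        elif current:                 # step 41: codeword started, cannot extend
            send_codeword()           # the same symbol is then re-examined
        else:                         # steps 45 to 47: symbol is not coded
            if mode != UNCOMPRESSED:
                out.append(UNCOMPRESSED)
                mode = UNCOMPRESSED
            out.append(symbols[i])
            i += 1
    if current:                       # flush a sequence pending at end of input
        send_codeword()
    return out

fig5_entries = {(3,): 2, (3, 3): 5, (3, 3, 2): 8}
stream = compress([3, 3, 2, 3, 5, 3, 3, 4], fig5_entries)
print(stream)
```

For the input 3, 3, 2, 3, 5, 3, 3, 4 the sketch emits 8, 2, 255, 5, 254, 5, 255, 4: the sequences 332 and 3 are coded, the uncoded symbol 5 is preceded by the indicator 255, and the return to compressed data for the sequence 33 is signalled by 254.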

Decompression is the exact reverse of the compression procedure described above. Each codeword is converted into the corresponding symbol or symbol sequence. The symbols are then summed to retrieve the original uncompressed data.
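
The two decompression steps may be sketched as follows. The function name decompress, the dictionary form of the codebook and the initial parameter (the signal level assumed before the first sample) are assumptions made for this illustration. It is further assumed that the stream starts in compressed mode, that the reserved values 254 and 255 never occur as data, and that no negative uncompressed symbols are present.

```python
COMPRESSED, UNCOMPRESSED = 254, 255  # reserved codewords

def decompress(stream, codebook, initial=0):
    """Expand codewords to difference symbols, then sum them."""
    inverse = {code: seq for seq, code in codebook.items()}
    mode = COMPRESSED  # assumption: the stream starts in compressed mode
    diffs = []
    for value in stream:
        if value in (COMPRESSED, UNCOMPRESSED):
            mode = value                   # mode indicator, carries no data
        elif mode == COMPRESSED:
            diffs.extend(inverse[value])   # codeword -> symbol sequence
        else:
            diffs.append(value)            # symbol was sent unchanged
    # Running sum inverts the difference function of the pre-processor.
    samples, level = [], initial
    for diff in diffs:
        level += diff
        samples.append(level)
    return samples

fig5_entries = {(3,): 2, (3, 3): 5, (3, 3, 2): 8}
restored = decompress([8, 2, 255, 5, 254, 5, 255, 4], fig5_entries)
print(restored)
```

Here the codewords expand to the difference symbols 3, 3, 2, 3, 5, 3, 3, 4, whose running sum gives the sample values 3, 6, 8, 11, 16, 19, 22, 26.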

FIGS. 11 a to 11 e illustrate an example using the coding algorithm described above. The algorithm was first trained, i.e. the codebook generated, using an IEGM signal containing 10000 samples sampled at 512 Hz with 8-bit resolution. The training signal is illustrated in FIG. 11 a. The algorithm was then tested on an IEGM signal containing 10000 samples sampled at 512 Hz with 8-bit resolution. This uncompressed test signal is depicted in FIG. 11 b. FIG. 11 c shows the signal after compression. This signal contains 2149 samples, which gives a compression ratio of approximately 4.65 (10000/2149). FIG. 11 d shows the signal of FIG. 11 c after decompression. Finally, FIG. 11 e shows the difference signal between the original signal depicted in FIG. 11 b and the decompressed signal of FIG. 11 d. The signal is entirely free of distortion.

Although modifications and changes may be suggested by those skilled in the art, it is the intention of the inventor to embody within the patent warranted hereon all changes and modifications as reasonably and properly come within the scope of his contribution to the art.

1. A data coding method comprising the steps of: monitoring a data signal containing a plurality of symbols and determining a plurality of most frequently occurring data components in said data signal, selected from the group consisting of most frequently occurring symbols and most frequently occurring sequences of symbols containing at least two symbols; allocating respective codewords to said most frequently occurring data components, thereby obtaining a codeword set; and forming a compressed signal by substituting the respective codewords for said most frequently occurring data components; and said data signal including uncoded symbols that are not among said plurality of most frequently occurring symbols, and reserving at least one codeword in said set as an indicator for said uncoded symbols.
2. A method as claimed in claim 1 wherein the step of monitoring said data signal comprises monitoring said data signal during a predetermined time period.
3. A method as claimed in claim 1 wherein said uncoded symbols include uncoded negative symbols, and comprising supplementing said at least one codeword serving as said indicator for uncoded symbols with at least one further codeword, for said uncoded negative symbols, indicative of a negative value.
4. A method as claimed in claim 1 wherein the step of allocating codewords comprises allocating codewords to respective data components that are incorporated in other data components having another codeword allocated thereto.
5. A data compression method comprising the steps of: converting a plurality of most frequently occurring data components in a data signal containing a plurality of symbols into respective codewords, said most frequently occurring data components being selected from the group consisting of most frequently occurring symbols and most frequently occurring sequences of symbols containing at least two symbols; and designating remaining symbols in said data signal, not among said most frequently occurring data components, with at least one codeword indicative of no compression; and substituting said codewords in place of said symbols.
6. A method as claimed in claim 5 comprising setting a predetermined number and a predetermined length for said codewords.
7. A method as claimed in claim 5 comprising preprocessing an input signal containing a plurality of symbols to generate said data signal by generating an additional symbol representing a difference between contiguous symbols in said input signal.
8. A method as claimed in claim 5 comprising the additional steps of: reading a symbol in said data signal; determining if the symbol that has been read corresponds to a codeword; and substituting said codeword for said symbol that has been read if said symbol that has been read corresponds to only one codeword.
9. A method as claimed in claim 8 wherein said symbol that has been read is a first symbol, and comprising the additional steps, if said first symbol corresponds to more than one codeword, of: reading a subsequent symbol following said first symbol; determining if said first symbol and said subsequent symbol correspond to a codeword; and substituting a codeword in place of said first symbol and said subsequent symbol if said first symbol and said subsequent symbol correspond to only one codeword.
10. A method as claimed in claim 9 comprising the additional step, if said symbol that has been read corresponds to no codeword, retaining said symbol that has been read in said data signal.
11. An arrangement for compressing and decompressing a data signal, comprising: a memory for storing codewords respectively corresponding to data components selected from the group consisting of symbols and symbol sequences; and a determination unit supplied with a data signal containing a plurality of symbols for determining if a symbol in said data signal corresponds to a codeword in said memory and, if a symbol corresponds to only one codeword in said memory, transmitting that codeword in place of said symbol and transmitting said symbol if said symbol corresponds to no codeword in said memory; and designating remaining symbols in said data signal, not among said most frequently-occurring data components, with at least one codeword indicative of no compression.
12. An arrangement as claimed in claim 11 wherein said memory includes a plurality of memory locations respectively designating codewords, and wherein each memory location contains an indication of a number of possible symbol sequences, and is mapped to a symbol of said data signal.
13. An arrangement as claimed in claim 12 further comprising a difference symbol generator, connected preceding said determination unit, which generates a difference symbol between contiguous symbols in said data signal.
14. An arrangement as claimed in claim 11 wherein said memory comprises a plurality of memory locations having respective addresses, and wherein said addresses are said codewords.
15. A computer-readable medium encoded with a computer program product for converting a data signal containing a plurality of symbols into a compressed signal, said computer program, when said medium is loaded in a computer, causing the computer to: establish a set of codewords by determining a plurality of most frequently occurring data components in a data signal, said most frequently occurring data components being selected from the group consisting of most frequently occurring symbols and most frequently occurring sequences of symbols containing at least two symbols; to allocate one codeword to each of said most frequently occurring data components; and to designate remaining symbols in said data signal, not among said most frequently occurring data components, with at least one codeword indicative of no compression.
16. A computer-readable medium as claimed in claim 15 wherein said program code causes said computer to compress said data signal by converting said most frequently occurring data components into respective codewords by reading a symbol in said data signal and determining if said symbol corresponds to a codeword, and if so, emitting said codeword instead of said symbol and, if not, emitting said symbol.
17. A data coding method comprising the steps of: monitoring a data signal containing a plurality of symbols and determining a plurality of most frequently occurring data components in said data signal, said data components consisting of most frequently occurring sequences of symbols containing at least two symbols; allocating respective codewords to said most frequently occurring data components, thereby obtaining a codeword set; and forming a compressed signal by substituting the respective codewords for said most frequently occurring data components.
18. A computer-readable medium encoded with a computer program for converting a data signal containing a plurality of symbols into a compressed signal, said computer program, when said medium is loaded in a computer, causing the computer to: establish a set of codewords by determining a plurality of most frequently occurring data components in a data signal, said most frequently occurring data components consisting of most frequently occurring sequences of symbols containing at least two symbols; and to allocate one codeword to each of said most frequently occurring data components.